Structural insights into inhibition of PRRSV Nsp4 revealed by structure-based virtual screening, molecular dynamics, and MM-PBSA studies

Background Porcine reproductive and respiratory syndrome respiratory sickness in weaned and growing pigs, as well as sow reproductive failure, and its infection is regarded as one of the most serious swine illnesses worldwide. Given the current lack of an effective treatment, in this study, we identified natural compounds capable of inhibiting non-structural protein 4 (Nsp4) of the virus, which is involved in their replication and pathogenesis. Results We screened natural compounds (n = 97,999) obtained from the ZINC database against Nsp4 and selected the top 10 compounds for analysing protein–ligand interactions and physicochemical properties. The five compounds demonstrating strong binding affinity were then subjected to molecular dynamics simulations (100 ns) and binding free energy calculations. Based on analysis, we identified four possible lead compounds that represent potentially effective drug-like inhibitors. Conclusions These methods identified that these natural compounds are capable of inhibiting Nsp4 and possibly effective as antiviral therapeutics against PRRSV.


Introduction
Porcine reproductive and respiratory syndrome virus (PRRSV) infection is an economically important disease in swine and accountable for significant losses to the pork industry worldwide [1]. PRRSV is an enveloped, single-stranded RNA virus of the genus Arterivirus [1][2][3][4] that causes respiratory disease and is responsible for severe reproductive failure in pigs [4]. Generally, the disease is further complicated by secondary infection and results in a high mortality rate [1]. The available treatments, including vaccines, poorly control the disease, owing to the high genetic and antigenic heterogeneity of the virus [5]. Antivirals might be useful in controlling and managing PRRSV, and recent studies have reported the ability of herbal extracts to inhibit PRRSV infection [6][7][8]. Many compounds derived from natural sources such as plants have shown inhibitory activity against viruses and a wide variety of pathogens [9,10]. Several natural compounds have demonstrated antiviral activity (for instance, immense potential for inhibiting viral replication) and drug-like activity [10]. The identification of potent inhibitors of PRRSV from natural sources is challenging, as it requires compound extraction and evaluation of antiviral activity, both of which require funds and specific experimental equipment. Given the difficulty in screening large numbers of natural compounds, chemo-informatics approaches are useful for identifying lead compounds from databases by targeting essential viral proteins [11,12].
Open reading frame (ORF)1a and ORF1b comprise~80% of the PRRSV genome and respectively encode pp1a and pp1b polyproteins that are cleaved by viral proteases into non-structural proteins [13]. Papain-like proteases (PL1pro and PL2pro) and a 3C-like serine protease [3CLSP; nonstructural protein 4 (Nsp4)] are the viral proteases involved in polyprotein cleavage and required for Arterivirus replication. Additionally, the reported involvement of Nsp4 in interferon inhibition is linked with PRRSV pathogenesis, suggesting it as a promising molecular target for novel therapeutics [14][15][16].
The goal of this study was to identify natural compounds capable of serving as novel inhibitors of PRRSV replication via their targeting of Nsp4. Specifically, we aimed to identify natural compounds through structurebased virtual screening, analyse their physicochemical properties, perform molecular dynamics (MD) simulations, and determine the binding affinities of the potential inhibitors using molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) methods.

Retrieval and preparation of ligand structures
We retrieved 97,999 compounds from a subset of the ZINC database housing natural compounds [17]. All natural compounds were downloaded in the structure-data file format, and these files were subsequently converted to AutoDock PDBQT [Protein Data Bank (PDB), partial charge (Q), and atom type (T)] files using OpenBabel software (https:// openbabel.org/wiki/Main_Page). These files were then used for interaction studies with target protein through structurebased virtual screening and molecular docking using Auto-Dock vina [18][19][20].
Retrieval and preparation of the target protein structure The crystal structure of PRRSV Nsp4 (PDB ID: 5Y4L) was retrieved from RCSB-PDB and visualised and analysed using UCSF Chimera-1.15 software [21,22]. An AutoDock tool was used for the addition of partial atomic charges (Kollman charge), hydrogen atoms, generation of gridbox, and preparation of the Nsp4 structure. The grid box was generated with centre (X = − 4.034, Y = 6.285, Z = 17.57) and size (X = 46, Y = 44, Z = 42) coordinates that were defined in a configuration file (exhaustiveness and energy ranges: 8 and 4, respectively). The prepared structure was saved in the PDBQT file format for molecular docking [23].

Structure-based virtual screening
Virtual screening is a computational method widely used for the identification of lead molecules by docking large numbers of compounds with a molecular target of interest to allow evaluation of the binding free energy of the docked/screened compounds using drug discovery programs [24]. AutoDock vina is a molecular docking and virtual screening program that determines the preferred relative orientation of a ligand during docking or interaction with a molecular target and provides a stable protein-ligand complex structure that exhibits a minimum binding energy [20]. Here, we used AutoDock vina to screen the retrieved natural compounds against PRRSV Nsp4 and generated protein-ligand complexes of the top 10 screened compounds using PyMol (https://pymol.org/2/). Two-dimensional (2D) models of the complexes were visualised using Discovery Studio Visualizer (https://discover.3ds.com/discovery-studiovisualizer-download) to determine the amino acid residues involved in the interactions [25].

Drug-likeness analysis
A total of seven principal descriptors were included to evaluate the drug-likeness of the top 10 screened natural compounds. These included molecular weight (MW), logP value, status as a hydrogen-bond donor (HBD), and acceptor (HBA), polar surface area (2D; PSA), polarizability (P), and van der Waals surface area (VWSA). MW, logP, HBD, and HBA were obtained from the ZINC database [17], whereas PSA, P, and VWSA were calculated using the MarvinSketch software (https://chemaxon.com/products/marvin).

MD simulation
The MD simulations were performed using GROMACS (v.2018.1; http://www.gromacs.org/) for stability predictions of the Nsp4-ligand complexes [26,27]. Six systems were generated and subjected to 100 ns MD simulations-one system estimated the stability of Nsp4 and the other five estimated the stability of the Nsp4-ligand complexes. The dynamic nature of the target protein and the docked-ligand complex was predicted in the presence of a solvent. All six systems were solvated in a box using a simple point-charge model. The topology of the ligands was created using ProDRG [28] and that for Nsp4 was generated using the GROMOS9653a6 force field [29]. The systems were neutralised by adding 1 Na + ion. To eliminate steric hindrance, the steepest energy minimization was used for all systems in order to obtain the maximal force below 1000 kJ/mol/nm. Long-range electrostatic interactions were determined using the particle mesh Ewald (PME) method [30]. For computation of Lennard-Jones and Coulomb interactions, we used a radius cut-off of 1.0 nm; the LINCS algorithm was used to constrain H-bond lengths [31]. All simulations applied a consistent time step of 2 fs. Short-range nonbonded interactions were predicted using a 10-Å cut-off distance, whereas long-range electrostatics were predicted using the PME method with 1.6-Å Fourier grid spacing. Shake algorithms were used to fix all bonds, including H-bonds [32]. After energy minimisation, the systems were equilibrated, followed by position-restraint simulations under NVT and NPT conditions to maintain the volume, temperature, and pressure. Finally, all systems were subjected to a 100 ns MD simulation; coordinates were saved at 2 fs intervals. Root-mean-square deviation (RMSD), root-mean-square fluctuation (RMSF), radius of gyration (Rg), solvent-accessible surface area (SASA), Hbonds, and Gibbs free energy landscapes were calculated, and principal component analysis (PCA) was performed to predict correlated motions generated during proteinligand interactions;'gmx rms', 'gmx rmsf', 'gmx gyrate', 'gmx sasa', 'gmx hbond', 'gmx sham', and 'gmx covar' (GROMACSv.2018.1; http://www.gromacs.org/), respectively, were used for these purposes. The resulting files were analysed and visualised using xmgrace (https:// plasma-gate.weizmann.ac.il/Grace/).

Binding-energy calculation
MM-PBSA is a widely used and well-accepted method for calculating the binding free energy of protein-ligand complexes [33]. Here, we used the g_mmpbsa tool (https://rashmikumari.github.io/g_mmpbsa/) to calculate the binding energy by integrating high-throughput MD simulation data [34]. The binding energy calculations can be described in the following equation: Here, van der Waals and electrostatic interaction were calculated in molecular mechanics energy (ΔG mm ). ΔG ps and ΔG nps are the polar and non-polar solvation energies, and TΔS is refer to the entropic contribution where temperature and entropy are denoted by T and S, respectively. The average binding energies of the top five protein-ligand complexes and amino acid residues contributing to the binding activity were calculated by using 'python' scripts included in the g_mmpbsa tool.

Identification of lead compounds through structure-based virtual screening
Structure-based virtual screening enables the prediction of optimal interactions between ligands and a macromolecular target for complex formation. The ligands are subsequently sorted according to their binding free energy for the target. This requires the three-dimensional structure of the target, with the compounds obtained from a database and categorised according to their affinity. In the present study, we downloaded a subset of natural compounds (n = 97,999) from the ZINC database for virtual screening against PRRSV Nsp4. We subsequently identified the top 10 compounds sorted according to their minimum binding free energy (range:− 10.0 to − 9.2 kcal/mol) for further analyses (Table 1).

Analysis and visualisation of the screened compounds
The active site residues of Nsp4 include His39, Asp64, and Ser118, as well as His133 and Ser136 that are reportedly essential for protein activity. The compound showing the optimal binding free energy (− 10.0 kcal/mol; ZINC38167083) demonstrated ligand interactions such as His39-mediated van der Waals interactions and Asp64mediated Pi-anion interaction. Fig. 1 shows protein-ligand H-bond interactions involving the active site residues His39, Ser118, and Ser136, besides, Asp64 and His133 is interacted with van der Waals and pi-anion interaction with ZINC08877407. The interacting amino acid residues of top 10 compounds are shown in Table 1.

Assessment of drug likeness through physicochemical property analysis
Physicochemical property analysis is one of the fundamental tasks in any drug discovery program. The top 10 screened compounds were then subjected to analysis of their physicochemical properties according to 7 principal descriptors (MW, logP value, status of HBDs and HBAs, 2D PSA, P, and VWSA). According to a previous study, a good drug should have an MW < 500 Da, an HBD < 5, and an HBA < 10. The MW, logP, HBD, and HBA of the selected compounds met the Lipinski's rule. Additionally, PSA, P, and VWSA results displayed drug-like behaviour. (Table 2).

MD simulation analysis
The structure of Nsp4 and top five screened compoundcomplex with Nsp4 was employed for 100 ns MD simulation study for predicting the dynamic changes during protein-ligand interaction and their nature of stability. The present study included various parameters i.e. RMSD, RMSF, Rg, SASA, H-bond, PCA, Gibbs free energy landscape, and binding free energy calculation.

Structural deviation analysis through RMSD
The RMSD value describes the dynamic behaviour among native structures to a new pose. After a 70 ns of simulation to obtain a stable trajectory, the RMSD values were 0.35, 0.25, 0.29, 0.23, 0.38, and 0.39 nm for Nsp4, Nsp4-ZINC38167083, Nsp4-ZINC16919178, Nsp4-ZINC08792350, Nsp4-ZINC01510656, and Nsp4-ZINC08877407, respectively. These data suggest that Nsp4-ZINC08792350 and Nsp4-ZINC38167083 are highly stable complexes relative to the others. Because each Nsp4-compound complex demonstrated stability after the 70 ns simulation, we performed further evaluations on each for last 30 ns trajectory ( Fig. 2A).

Interaction analysis through hydrogen bonding
Hydrogen bonding is the most important bond for stabilizing protein-ligand interactions. The average number of hydrogen bonds for the complexes Nsp4-ZINC38167083, Nsp4-ZINC16919178, Nsp4-ZINC08792350, Nsp4-ZINC01510656, and Nsp4-ZINC08877407 over the final 30 ns of the simulations was 0-1 and that for Nsp4-ZINC38167083 and Nsp4-ZINC16919178 was 0-2 and 0-3, respectively (Fig. 2E). Hence, these compounds interacted with Nsp4 and provided a stable complex during protein-ligand interactions.

Principal component analysis (PCA)
In PCA, the sum of the eigenvalues suggests the overall flexibility of a structure under different conditions. Therefore, the first 5 of 50 eigenvectors used to calculate eigenvalues from the final 30 ns of the simulation were used to determine the percentage change in structural movement. The results revealed that these five eigenvectors accounted for 42.85, 63.97, 63.27, 59.14, 64.83, and 71.05% of the motions for Nsp4, Nsp4-ZINC38167083, Nsp4-ZINC16919178, Nsp4-ZINC08792350, Nsp4-ZINC01510656, and Nsp4-ZINC08877407 respectively (Fig. 3A), suggesting increased movement after the binding of each ligand. Moreover, Nsp4-ZINC38167083, Nsp4-ZINC16919178, Nsp4-ZINC08792350, and Nsp4-ZINC01510656 showed less overall motion relative to Nsp4-ZINC08877407. Additionally, generation of a 2D plot for assessing protein dynamics after ligand binding suggested the overall stability (lowcorrelated motions) of Nsp4, Nsp4-ZINC38167083, and Nsp4-ZINC08792350 (Fig. 3B), indicating these compounds as possible leads for further evaluation as inhibitors.

Gibbs free energy landscape
We then calculated the Gibbs free energy landscape using the first two principal components (PC1 and PC2) in order to visualize the results. Fig. 4 shows the colourcoded plots generated for Nsp4 along with each complex. The lowest free energy values (≤9.08 kJ/mol) were observed for Nsp4-ZINC38167083, suggesting that this complex demonstrated overall thermodynamic stability. The other complexes (Nsp4-ZINC16919178, Nsp4-ZINC08792350, Nsp4-ZINC01510656, and Nsp4-ZINC08877407) had values of to 11.4 kJ/mol, implying that these complexes have numerous high-energy minima.

Binding free energy
We then evaluate the binding free energy associated with each ligand through MM-PBSA using the final 10 ns of the simulation, for calculation of van der Waals and electrostatic interactions, Polar solvation, and SASA.  (Table 3).
The investigation of residual binding energy is a key method for identifying residues important to ligand binding. Fig. 5 shows that amino acid residues at positions 5 to 142 contributed significantly to binding of ZINC38167083, ZINC16919178, ZINC08792350, and ZINC01510656, which are the catalytic residues in the active site. Fewer contacts were observed in relation to ZINC08877407 binding, suggesting that ZINC38167083, ZINC16919178, ZINC08792350, and ZINC01510656 represent potential Nsp4 inhibitors.

Discussion
PRRSV is a recalcitrant and intricate disease in a pig when working as a cofactor in a porcine respiratory disease complex (PRDC) or primary infectious agent. It was  identified as the most frequent virus linked to PRDC [35][36][37][38][39]. Furthermore, PRRSV has been shown to impair the host immune system, which can lead to more serious secondary infections, and chronic disorders [35]. The involvement of Nsp4 in PRRSV replication and pathogenesis is decoded and recommended as one of the key molecular targets for drug development [14]. Therefore, identification of Nsp4 inhibitors is needed to prevent and manage the disease. Natural compounds have made immense contributions in the identification of lead molecule(s) with antiviral potential. It is believed that the disease can be controlled successfully by developing small molecules that can inhibit Nsp4 activity linked with pathogenesis [14]. In the present study, computational approaches are utilized for the identification of possible lead compounds via molecular docking of natural compounds database through structure-based virtual screening followed by downstream analysis. Structure based-virtual screening is a powerful computational approach that is used to investigate important lead molecule(s) from a big set of a compound database based on the lowest binding energy required for stabilizing the protein-ligand complex [40]. From the structure-based virtual screening, we have selected the top ten natural compounds that show interaction with key residues. Further, protein-ligand analysis of the top 5 compounds demonstrated that the ZINC38167083 interacted with Nsp4 and formed one conventional hydrogen bond at position Gly63. Besides, amino acid residues Ser18, His39, Leu41, Thr42, Gly43, Asn44, Thr134, and Thr145 were involved in van der Waals interactions, and Val61 and Ile143 formed alkyl bonds. Ala38 and Phe151 participated in interaction through amide-pi and pi-pi t-shaped bonding. Additionally, Asp64 contributed to interaction through the pianion bond. ZINC16919178 bonded with Nsp4 at position Asn11 by one conventional hydrogen bond. In addition, amino acid residues Phe3, Thr5, Ser9, Leu10, Phe26, Ile123, and Gly127 formed van der Waals interactions; Pro78, Tyr92, Leu94, Val99, and Pro101 formed alkyl and pi-alkyl bonds. ZINC08792350 interacted with Nsp4 Thr5, Ser9, Asn11, Val76, Pro78, Ile123, Phe166, and Asp192 through van der Waals interactions; Phe3 and Tyr92 formed pi-pi t-shaped bonding. In addition, the amino acid residues Leu10, Pro101, Leu196 contributed to interaction through pi-alkyl bonding; Arg90 and Val99 formed pi-anion and pi-sigma bonds with ZINC08792350, respectively. ZINC01510656 bonded with Nsp4 at Thr5, Asn11, Val76, Phe26, Leu94, and Gly127 through van der Waals interactions; amino acid residues Leu10, Pro78, and IIe123 formed pi-alkyl bonds, and Ser9, Tyr92, Val99 contributed in interaction by carbon-hydrogen bonding, pi-pi t-shaped, and pi-sigma interactions, respectively. ZINC08877407 formed conventional hydrogen bonds with Nsp4 at position His39, Ser118, ASP117, and Ser136; amino acid residues Gly63, Asp64, Ala114, Cys115, Gly116, Thr134, Lys138, and Thr145 contributed to protein-ligand interaction through van der Waals. Additionally, Gly135 formed a carbon-hydrogen bond, and His133, Ile143, and Phe151 interacted through pi-anion, pi-alkyl, and pi-sigma bonding, respectively. Medicinal chemists have traditionally been interested in noncovalent interactions that are indicative of attraction, directed intermolecular forces in Fig. 5 Plot depicting the amino acid residues of Nsp4 contributing to the binding with natural compounds. Red, green, blue, orange, and violet colours represent ZINC38167083, ZINC16919178, ZINC08792350, ZINC01510656, and ZINC08877407, respectively their search for the "glue" that keeps ligand and their molecular target together. In recent years, with the rapid increase in the number of solved biomolecular structures and the performance enhancement of computational methods, it is now possible to provide a more thorough understanding of protein-ligand interaction [41]. Therefore, based on the results, it was concluded that the screened compounds can inhibit the virulence activity of Nsp4 [14]. Besides, the results of physicochemical properties prediction suggest that the screened compounds demonstrated good drug-like behavior and could be considered for further analysis [42,43]. Therefore, 100 ns MD simulation analysis was conducted for Nsp4 and top 5 natural compounds i.e. ZINC38167083, ZINC16919178, ZINC08792350, ZINC01510656, and ZINC08877407, respectively with Nsp4 to evaluate the dynamic behavior of protein and protein-ligand complexes. It is recognized as a powerful approach for predicting the conformational stability of macromolecules before and after ligand binding, besides the simulated data can be utilized for calculation of real binding energy of small molecules concerning time along with a contribution of binding amino acid residues present in the macromolecular target [44]. Several structural parameters were calculated, including RMSD, RMSF, Rg, SASA, H-bonding, PCA, and gibbs free energy [45][46][47][48]. The RMSD value indicated that all of the complexes were stable and creating an equilibrated trajectory for further investigation. As a result, we determined RMSF, Rg, SASA, PCA, and Gibbs free energy to determine the nature of each system subjected for MD simulation. Drug selectivity, metabolization, and stability all require Hbonds. To better understand the H-bond and its contributions to the overall stability of each system, an Hbond analysis of natural compounds-Nsp4 complexes were calculated. The hydrogen bonding study indicates that all of the Nsp4-complexes are stable and made bonding with essential catalytic residues [49]. The overall analysis revealed that each complex was stabilizing after 70 ns indicating better interaction with Nsp4 in terms of stability that is required for its inhibition. Further, MM-PBSA binding free energy and residual binding energy were calculated to assess the binding affinities of natural compounds with Nsp4. For determining the binding free energy of protein-ligand complexes by using MD simulation trajectory, it is a frequently used and well-accepted method [33,50]. The strength of the binding contacts between the ligand and the target protein is measured by ligand binding affinity, which is directly linked to ligand potency. In the field of drug discovery, its evaluation is crucial. Furthermore, in favorable reactions, the free energy is negative. So, lowering the binding energy improves interactions, and low binding energy corresponds to the high binding affinity of protein-ligand complexes [51]. MM-PBSA analysis demonstrated that the compounds ZINC38167083, ZINC16919178, ZINC08792350, and ZINC01510656 can act as a potential lead for inhibition of Nsp4 [52,53]. Whereas, ZINC08877407 was not recommended as a lead because their binding energy was found to be higher as compared to other compounds.
In past years, the identification of lead compounds for drug development take much time and cost as well as required good infrastructure experimental facilities [11,54]. Due to advances in structural biology, computer science, and bioinformatics, it becomes easy to find out putative molecule(s) by a screening of a big database that has a strong affinity with the target for experimental evaluation [24,55]. It saves the cost and time of the scientific community. Most of the medicines available in the market are from a natural source or it is a derivative of naturally occurring molecules [11,24]. Natural compounds have immense potential to inhibit virus and pathogenic proteins and act as antiviral drugs [56][57][58]. The results presented in this work are, therefore, informative for understanding the antiviral potential of suggested compounds as therapeutics for PRRSV. It might be also useful for the prevention of pigs and other animals from different viral diseases [59,60].

Conclusions
PRRSV infection is a main concern for the global swine industry, and there is a need to identify novel and effective therapeutic agents. Given the importance of Nsp4 in PRRSV replication and pathogenesis, we employed computational and MD approaches to screen and identify natural compounds as novel inhibitors of Nsp4 activity. The results identified four possible lead compounds that represent potentially effective drug-like inhibitors for application as antiviral therapeutics. Further studies are warranted to confirm these findings through experimental and clinical evaluations in order to promote future management of PRRSV infection. Availability of data and materials All data generated or analysed during this study are included in the manuscript.

Declarations
Ethics approval and consent to participate Not applicable.

Consent for publication
Not applicable.